Subject: Cluster out of sync, but AI, stats indicate OK
Product Area: Domino Server
Technical Area: Administration
Platform: Windows
Release: 8.5.2
Reproducible: Intermittent

We have 5 servers in a cluster, with 6 Notes databases in a folder that is clustered, and we have connection docs that schedule replication of those databases as a backup to cluster replication.
The servers are in the same physical location, on a Gb connection.
We have 1 "hub" server that initiates the replications (every 60 mins) with the other "spoke" cluster mates; this was done to offload the replication load onto the "hub" cluster mate. Until a week ago, clustering and replication had been no problem, but 1 server in the cluster is now getting out of sync. Its replication times have risen dramatically, even though (according to the rep docs) it is only transferring a tiny amount of data: a few docs, a few KB.
We have group A users on serverA, and group B users on serverB. All access is via HTTP.
We had an issue a few weeks ago where performance degraded and replication times increased dramatically for one server. DDM and the replication logs showed that a few docs were either corrupt or over 32K. We deleted those docs and the issue instantly went away.
We thought this might be the issue again, but this time nothing is reported anywhere to suggest that it is.
The cluster stats replica.cluster.secondsonqueue (avg = 5, max = 13; I realise single digits for both is best) and workqueuedepth (5) are, I believe, within acceptable tolerance.
The Availability Index of the servers in the cluster is good.
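For anyone who wants to run the same sanity check on the queue figures quoted above, here is a minimal Python sketch. It assumes the stats have been pasted from the server console as plain "Name = value" lines. The limit values, the helper names and the exact `.Avg`/`.Max` stat names are illustrative assumptions and not official thresholds, so adjust them for your own cluster.

```python
import re

# Hypothetical limits, not official values: tune them for your own cluster.
LIMITS = {
    "replica.cluster.secondsonqueue.avg": 15,
    "replica.cluster.secondsonqueue.max": 30,
    "replica.cluster.workqueuedepth": 10,
}

# Assumed input format: one "Stat.Name = 123" pair per line.
STAT_LINE = re.compile(r"^\s*([\w.]+)\s*=\s*([\d,]+)\s*$")


def parse_stats(text):
    """Turn 'Name = value' lines into a dict keyed by lower-cased stat name."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line)
        if match:
            stats[match.group(1).lower()] = int(match.group(2).replace(",", ""))
    return stats


def over_limit(stats, limits=LIMITS):
    """Return {name: (value, limit)} for every stat that exceeds its limit."""
    return {
        name: (stats[name], limit)
        for name, limit in limits.items()
        if name in stats and stats[name] > limit
    }


# The figures from this post: avg 5, max 13, work queue depth 5.
sample = """
  Replica.Cluster.SecondsOnQueue.Avg = 5
  Replica.Cluster.SecondsOnQueue.Max = 13
  Replica.Cluster.WorkQueueDepth = 5
"""
print(over_limit(parse_stats(sample)))
```

With these assumed limits nothing is flagged for the figures in this post, which matches the observation that the queue stats look healthy even though one server is out of sync.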
We cleared the Rep History of the DBs, which made no difference.
So the cluster is out of sync and replication is taking a very long time, but only for 1 server in the cluster; all the others are in sync and cluster almost immediately.
I am about to run fixup, although I believe this does not cure docs with the 32K problem. If we do have the 32K problem, we have no NoteIDs in DDM or the rep docs to go on, so I am not sure what to do.
We are close to the stage of either just deleting all the DBs and replicating them afresh, or just rebuilding the server.
All comments, suggestions and feedback gladly welcomed.
Nick
 
Feedback number WEBB8MXT7H created by ~Holly Quetfookony on 10/24/2011

Status: Open
Comments:

Cluster out of sync, but AI, stats ... (~Holly Quetfook... 24.Oct.11)
. . Solution (~Holly Quetfook... 27.Oct.11)